Improving speech synthesis for noisy environments
Abstract
Speech synthesizers have traditionally been built on carefully read speech recorded in a studio environment. Such voices are suboptimal for use in noisy conditions, which are inevitable in the majority of deployed speech systems. In this work, we attempt to modify the output of speech synthesizers to make it more appropriate for noisy environments. We present a comparison of the spectral and prosodic features of speech in noise, along with the results of some conversion techniques.
Similar resources
Speech Emotion Recognition Based on Power Normalized Cepstral Coefficients in Noisy Conditions
In recent years, automatic recognition of emotional states from speech in noisy conditions has become an important research topic in emotional speech recognition. This paper considers the recognition of emotional states via speech in real environments. For this task, we employ the power normalized cepstral coefficients (PNCC) in a speech emotion recognition system. We investigate its perfor...
Improving the performance of MFCC for Persian robust speech recognition
Mel-frequency cepstral coefficients (MFCCs) are the most widely used features in speech recognition, but they are very sensitive to noise. In this paper, to achieve satisfactory performance in Automatic Speech Recognition (ASR) applications, we introduce a new noise-robust set of MFCC vectors estimated through the following steps. First, spectral mean normalization is a pre-processing which applies to t...
MLP network for enhancement of noisy MFCC vectors
The performance of voice dialling systems often degrades rapidly as the intensity of the background noise increases. In this paper, we describe a neural-network-based speech enhancement technique for improving the speech recognition performance of a voice dialling system in very noisy, real-world conditions. The speech samples were recorded in laboratory conditions and afterwards corrupted ...
Speech Production in Noisy Environments and the Effect on Automatic Speech Recognition
Speech is bimodal in nature and includes the audio and visual modalities. In addition to acoustic speech perception, speech can also be perceived using visual information provided by the mouth/face (i.e., automatic lipreading). In this study, visual speech production in noisy environments is investigated. The authors show that the Lombard effect plays an important role not only in audio spe...
The Efficient Pmc for Robust Speech Recognition in Noisy Environments
Environment-adaptive methods play an important part in improving the robustness of automatic speech recognition. In this paper, PMC is reviewed and improved to achieve better performance. The experiments were carried out using Cambridge's HTK toolkit to implement continuous Mandarin digit recognition in noisy environments.